Source normalization for language-independent speaker recognition using i-vectors

Authors

  • Mitchell McLaren
  • Miranti Indar Mandasari
  • David A. van Leeuwen
Abstract

Source-normalization (SN) is an effective means of improving the robustness of i-vector-based speaker recognition for under-resourced and unseen cross-speech-source evaluation conditions. The technique of source-normalization estimates directions of undesired within-speaker variation more accurately than traditional methods when cross-source variation is not explicitly observed from each speaker in system development data. Source-normalization can be incorporated into Within Class Covariance Normalization (WCCN) as an effective preprocessing step to Probabilistic Linear Discriminant Analysis (PLDA) based speaker recognition with i-vectors. This paper proposes to extend the application of source-normalization to the reduction of language-dependence in PLDA speaker recognition by normalising for the variation that separates languages. Evaluated on the NIST 2008 and 2010 speaker recognition evaluation (SRE) data sets, the proposed Language Normalized WCCN (LN-WCCN) provides relative improvements of 26% in minimum DCF and 14% in EER under multilingual scenarios without detriment to common English-only conditions. LN-WCCN is also shown to significantly improve calibration performance when calibration parameters are learned from scores mismatched to evaluation conditions.
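
As an illustration of how source-normalization might fold into WCCN when the language is used as the source label, here is a minimal NumPy sketch. It is not the authors' implementation. The function name ln_wccn, the small ridge term, and the scatter definitions are assumptions based on the general source-normalization idea, not on details given in this abstract: between-speaker scatter is measured around each language's own mean, and within-speaker scatter is taken as the total scatter minus that, so the variation separating languages ends up in the within-speaker term and is normalised away.

```python
import numpy as np

def ln_wccn(ivectors, speakers, languages, ridge=1e-6):
    """Sketch of a language-normalized WCCN transform (assumed formulation).

    ivectors : (n, d) array of i-vectors
    speakers, languages : length-n label arrays
    Returns (B, S_W) with B @ B.T == inv(S_W).
    """
    X = np.asarray(ivectors, dtype=float)
    speakers = np.asarray(speakers)
    languages = np.asarray(languages)
    n, d = X.shape

    # Total scatter around the global mean.
    Xc = X - X.mean(axis=0)
    S_T = Xc.T @ Xc

    # Between-speaker scatter, measured around each language's own mean
    # so that language offsets do not count as speaker information.
    S_B = np.zeros((d, d))
    for lang in np.unique(languages):
        in_lang = languages == lang
        mu_lang = X[in_lang].mean(axis=0)
        for spk in np.unique(speakers[in_lang]):
            sel = in_lang & (speakers == spk)
            diff = X[sel].mean(axis=0) - mu_lang
            S_B += sel.sum() * np.outer(diff, diff)

    # The residual holds within-speaker variation plus the
    # between-language variation that should be normalised.
    S_W = (S_T - S_B) / n + ridge * np.eye(d)  # ridge is an assumption

    # WCCN projection: lower-triangular B with B B^T = S_W^{-1}.
    B = np.linalg.cholesky(np.linalg.inv(S_W))
    return B, S_W
```

Under these assumptions, i-vectors would be projected as `X @ B` before PLDA training and scoring, after which the estimated within-speaker covariance in the projected space is the identity.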

Similar resources

A PLDA approach for language and text independent speaker recognition

There are many factors affecting the variability of an i-vector extracted from a speech segment such as the acoustic content, segment duration, handset type and background noise. The state-of-the-art Probabilistic Linear Discriminant Analysis (PLDA) aims at modelling all these sources of undesirable variability within a single covariance matrix. Although techniques such as source normalization ...

Content Normalization for Text-independent Speaker Verification

In the past few years, Deep Neural Network (DNN) based i-vector Speaker Verification (SV) systems have been shown to provide state-of-the-art performance. However, error rates increase drastically for short duration recordings. In this paper, we improve the i-vector approach for short utterances, (i) by using smoothed DNN posteriors for i-vector extraction, and (ii) by normalizing the content of the ...

Connectionist Speaker Normalization and Its Applications to Speech Recognition

Speaker normalization may have a significant impact on both speaker-adaptive and speaker-independent speech recognition. In this paper, a codeword-dependent neural network (CDNN) is presented for speaker normalization. The network is used as a nonlinear mapping function to transform speech data between two speakers. The mapping function is characterized by two important properties. First, the ass...

Improving short utterance based i-vector speaker recognition using source and utterance-duration normalization techniques

A significant amount of speech is typically required for speaker verification system development and evaluation, especially in the presence of large intersession variability. This paper introduces source and utterance-duration normalized linear discriminant analysis (SUN-LDA) approaches to compensate for session variability in short-utterance i-vector speaker verification systems. Two variations ...

Publication date: 2012